Unsupervised outlier detection in multidimensional data
Authors
Abstract
Detection and removal of outliers in a dataset is a fundamental preprocessing task, without which the analysis of the data can be misleading. Furthermore, the existence of anomalies can heavily degrade the performance of machine learning algorithms. In order to detect anomalies in an unsupervised manner, some novel statistical techniques are proposed in this paper. The techniques are based on statistical methods considering compactness and other properties. The newly proposed ideas are found to be efficient in terms of performance, ease of implementation, and computational complexity. Two of the techniques presented in the paper use a transformation to a unidimensional distance space to detect outliers, so irrespective of the data’s high dimensions, they remain computationally inexpensive and feasible. A comprehensive analysis of the proposed anomaly detection schemes is presented in the paper, and they are found to perform better than state-of-the-art methods when tested on several benchmark datasets.
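The idea of moving to a unidimensional distance space can be illustrated with a minimal sketch. This is not the authors' algorithm: the choice of the coordinate-wise median as the reference point, the median-plus-scaled-MAD threshold, and the names `distance_outliers` and `k` are all assumptions made only for illustration.

```python
import math
import statistics


def distance_outliers(points, k=3.0):
    """Return indices of points whose distance to a robust centre is unusually large.

    Assumptions (not from the paper): the centre is the coordinate-wise
    median, and a point is flagged when its distance exceeds
    median + k * 1.4826 * MAD of all distances.
    """
    n_dims = len(points[0])
    # Robust reference point: median of each coordinate.
    center = [statistics.median(p[d] for p in points) for d in range(n_dims)]
    # Transformation step: each d-dimensional point becomes one scalar distance.
    dists = [math.dist(p, center) for p in points]
    # Outlier test in the 1-D distance space.
    med = statistics.median(dists)
    mad = statistics.median(abs(x - med) for x in dists)
    threshold = med + k * 1.4826 * mad
    return [i for i, x in enumerate(dists) if x > threshold]


points = [(0, 0, 0), (1, 0, 1), (0, 1, 0), (1, 1, 1), (0, 0, 1), (10, 10, 10)]
print(distance_outliers(points))
```

Whatever the number of input dimensions, the thresholding step only ever sees one number per point, which is why approaches of this kind stay cheap on high-dimensional data. Note that when the MAD is zero this sketch flags every point farther than the median distance, so a real implementation would need a fallback rule.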
Similar resources
Very Fast Outlier Detection in Large Multidimensional Data Sets
Outliers are objects that do not comply with the general behavior of the data. Applications such as exploration in science databases need fast interactive tools for outlier detection in data sets that have unknown distributions, are large in size, and are in high dimensional space. Existing algorithms for outlier detection are too slow for such applications. We present an algorithm based on an ...
Outlier Detection in Multivariate Data
The objective of this research is the detection of outliers in multivariate data employing various distance measures, particularly using a robust regression diagnostic technique. Several classical outlier identification methods are based on the sample mean and covariance matrix in general. But they do not always yield better results, as they themselves are affected by the outliers. Sometimes one outlier...
Outlier detection in astronomical data
Astronomical data sets have experienced an unprecedented and continuing growth in the volume, quality, and complexity over the past few years, driven by the advances in telescope, detector, and computer technology. Like many other fields, astronomy has become a very data rich science. Information content measured in multiple Terabytes, and even larger, multi Petabyte data sets are on the horizo...
Outlier Detection with Uncertain Data
In recent years, many new techniques have been developed for mining and managing uncertain data. This is because of the new ways of collecting data which has resulted in enormous amounts of inconsistent or missing data. Such data is often remodeled in the form of uncertain data. In this paper, we will examine the problem of outlier detection with uncertain data sets. The outlier detection probl...
Journal
Journal title: Journal of Big Data
Year: 2021
ISSN: 2196-1115
DOI: https://doi.org/10.1186/s40537-021-00469-z